Abstract

Upon compressing perceptually relevant signals, conventional quantization generally results in unnatural
outcomes at low rates. We propose distribution preserving quantization (DPQ) to solve this problem.
DPQ is a new quantization concept that confines the probability space of the reconstruction to be identical 
to that of the source. A distinctive feature of DPQ is that it facilitates a seamless transition between signal 
synthesis and quantization. A theoretical analysis of DPQ leads to a distribution preserving rate-distortion 
function (DP-RDF), which serves as a lower bound on the rate of any DPQ scheme, under a constraint 
on distortion. In general situations, the DP-RDF approaches the classic rate-distortion function for the 
same source and distortion measure, in the limit of an increasing rate. A practical DPQ scheme based 
on a multivariate transformation is also proposed. This scheme asymptotically achieves the DP-RDF for 
i.i.d. Gaussian sources and the mean squared error. 
On Distribution Preserving Quantization

I. Introduction 

Quantization is an integral component in the lossy coding of perceptually relevant signals, e.g., audio 
and video. Conventional quantization seeks the optimal trade-off between the rate and a (perceptual) distortion
measured on a pair of realizations in the sample space. Such a paradigm allows the reconstruction
to converge to the source as the rate increases, thus achieving the best possible quality. However, at certain
low rates, merely optimizing the rate against the distortion can lead to an unnatural reconstruction, compared
to other strategies such as synthesizing the signal from a model. Examples include the following widely known
facts about practical quantizers:

• The reconstruction is discrete-valued. In image coding, the discrete nature of the reconstruction
severely affects the rendering quality. A popular remedy is halftoning [1], which attempts to transform
discretized images into continuous-tone images;

• The reconstruction has limited bandwidth. This causes a so-called "band-limited" artifact in audio
compression [2]. Bandwidth extension (BWE) [3] and spectral band replication (SBR) [4] have been
developed to solve this problem. They are essentially based on synthesizing the missing frequency
bands;

• The reconstruction is reduced to zero at a rate of zero, even if a probabilistic model of the source
is known to the decoder. In fact, if a model of the source is available, a reconstruction can be
synthesized. Synthesis, although it may not be optimal in the rate-distortion sense, can produce
a natural reconstruction. Analysis-synthesis is a common substitute for quantization in low rate coding
of perceptually relevant signals (see, e.g., [5], [6]).

It can be seen that a premise of compressing perceptually relevant signals is the naturalness of the 
reconstruction. Generally, a signal is judged as being natural if it has a high occurrence probability in 
nature. This notion is in line with the widespread belief that the neural processing is adapted to the 
environment [7]. We note that the naturalness can be related to the context, e.g., to judge the naturalness 
of a speech signal, one may consider only how it is compared to natural speech signals. Natural signals 
can be modeled as a probability space. Quantization is a system that operates on this probability space, 
resulting in another that describes its reconstruction. To achieve a general naturalness of the reconstruction, 
it is logical to restrict the two probability spaces to be close. 
The probability space of natural signals is required to be known a priori. Studies on the statistics of
natural signals, e.g., natural sound [8] and images, can help in obtaining such a probability space.
From an information theoretical perspective, the probability space of the source, which is assumed to be
revealed to both the encoder and the decoder, can be utilized. Supposing that the source is natural, it
forms a subspace of the probability space of natural signals. Thus preserving the source probability space
can fulfill our goal of ensuring the naturalness of the reconstruction. An advantage of adopting the source
probability space is that no definitions are needed beyond those used in classic rate-distortion
theory.

With conventional quantization, the probability space of the reconstruction generally differs from that of 
the source. This deviation is not only an implementational limitation but is often a theoretical necessity. 
Quantization has its roots in rate-distortion theory [9], which, among many things, defines the rate-distortion
function (RDF) that provides the optimal achievable rate-distortion trade-off for any quantizer.
In general situations, the reconstruction that achieves the RDF forms a different probability space from 
the source. This is reflected by the following facts: 

• The optimal reconstruction is discrete-valued for many sources at a certain squared error distortion 

ED; 

• The source is the sum of the reconstruction and a quantization noise that is independent of the 
reconstruction, when the Shannon lower bound is tight; 

• When the rate is zero, the reconstruction becomes the mathematical expectation of the source for 
the minimum mean squared error. 

These results are consistent with the aforementioned facts of practical quantizers, implying that the 
problem of conventional quantization in altering the source probability space needs to be solved on a 
theoretical level. Since statistical differences are, per se, measures on probability measures, it can be 
difficult, if at all possible, to achieve the preservation of the source statistics in classic rate-distortion 
theory by choosing a sample-based distortion measure. An alternative approach is to impose constraints 
on the probability measure of the reconstruction. 

An early attempt at introducing constraints on the statistics of the reconstruction into quantization is moment
preserving quantization [11]. The preservation of certain statistical moments turned out to be advantageous
in the context of image coding [12]. However, moment preserving quantization has some limitations: 1)
it cannot preserve non-moment-like statistical properties, e.g., the continuous range of the sample values;
2) it is based on arranging the space partition and the reconstruction points, so it is limited by the rate of the
quantizer; and 3) its performance depends on a specific choice of the moments to be preserved and is,
therefore, difficult to analyze.

In this article, we consider a new class of quantization, namely distribution preserving quantization
(DPQ). DPQ preserves the probability space of the source, by which all statistical properties are
maintained. Instead of manipulating the parameters of any particular quantizer, DPQ uses an ensemble
of quantizers, which, as a whole, achieves the preservation of the source probability space. Such a
construction places no restriction on the rate of DPQ. Moreover, the preservation of the probability
space facilitates analysis.

DPQ provides a link between conventional quantization and synthesis. In the zero-rate situation (omitting
model description), the reconstruction has to be generated in the same manner as the source. When
the rate is higher, less synthesis is needed and DPQ can become more like conventional quantization, 
which at high rates already preserves the probability space of the source in some senses. A key feature 
of DPQ is that it can achieve a seamless transition from one technique to the other. In particular, this is a 
natural outcome of optimizing a rate-distortion trade-off on top of the preservation of probability space. 

The authors of this article have proposed practical DPQ schemes in [13], [14], which have shown
a superior performance over conventional quantizers in audio coding. This article is dedicated to some
theoretical aspects of DPQ. In particular, we study an amended rate-distortion theory that serves as a
guideline for DPQ. The main contributions of this article can be summarized as follows:

• a formal definition of DPQ (Section II);

• a lower bound of the rate-distortion performance of DPQ schemes, namely the distribution preserving
rate-distortion function (DP-RDF), and its properties (Section III);

• an asymptotically optimal DPQ scheme and its properties (Section IV); and

• a proof of the achievability of the DP-RDF for Gaussian distribution and the mean squared error
(Section V).

II. Definition of DPQ 

With conventional quantization, the reconstruction can form an arbitrary probability space, which does
not guarantee a perceived naturalness. A solution is to confine the probability space of the reconstruction
to be identical to that of the source, which is the essence of DPQ. An alternative is to relax the identity
by putting a constraint on a measure of the difference between the two probability spaces. One example can
be found in [13], where a Kullback-Leibler divergence is used as such a measure. However, we require the two
probability spaces to be identical in this article. A benefit is that no more mathematical entities are needed
than those used in classic rate-distortion theory.

The probability spaces of the source and the reconstruction are denoted as $(A, \mathcal{A}, \mu)$ and $(\hat{A}, \hat{\mathcal{A}}, \hat{\mu})$,
respectively. Here $A$ is a sample space consisting of all realizations of the source, $\mathcal{A}$ is a $\sigma$-algebra
consisting of subsets of $A$, and $\mu$ is a probability measure on $\mathcal{A}$; $\hat{A}$, $\hat{\mathcal{A}}$, and $\hat{\mu}$ are defined similarly.
Quantization can be described as a mathematical structure that links the two probability spaces.

Conventional quantization is defined as a mapping from $A$ to $\hat{A}$. If DPQ is also defined as a mapping,
it must be a measure-preserving transformation. However, a measure-preserving transformation does
not facilitate data compression, since the entropy is invariant. To obtain a feasible definition for DPQ,
stochastic codes must be introduced. According to Billingsley [15], a stochastic code is a channel, in
which the key component is a conditional probability measure $\phi$ defined on the Cartesian product of $A$
and $\hat{\mathcal{A}}$, denoted as $A \times \hat{\mathcal{A}}$. The conditional probability measure $\phi$, together with the probability measure
of the source $\mu$, induces a source-reconstruction joint probability space $(A \times \hat{A}, \mathcal{A} \times \hat{\mathcal{A}}, \rho)$, where

$$\rho(G, \hat{G}) = \int_G \phi(a, \hat{G})\, d\mu(a) \tag{1}$$

for any $G \in \mathcal{A}$ and $\hat{G} \in \hat{\mathcal{A}}$. It further determines the probability measure of the reconstruction, i.e., $\hat{\mu}$,
as

$$\hat{\mu}(\hat{G}) = \rho(A, \hat{G}). \tag{2}$$

By choosing a proper $\phi$, it is possible to achieve $\hat{\mu}(G) = \mu(G), \forall G \in \mathcal{A}$, thus fulfilling the requirement
of DPQ.

Although most of the existing quantization methods are deterministic, stochastic codes do exist in
practice. An example is dithered quantization [16], for which a dither is generated by a random
number generator, added to the source and subtracted from the output of a quantizer, yielding a final
reconstruction. With a proper dither, dithered quantization is statistically equivalent to a channel with
an additive noise [17], i.e., $\phi(a, \hat{G}) = \varepsilon\{n : a + n \in \hat{G}\}$, where $\varepsilon$ is the probability measure of the
additive noise.

However, defining a DPQ as a stochastic code is not constructive, i.e., it does not lead to a practical
encoder and decoder. A better definition resorts to a quantizer ensemble. A quantizer ensemble is a
probability space $(Q, \mathcal{Q}, \psi)$, where $Q$ consists of measurable mappings from $A$ to a countable subset of
$\hat{A}$, $\mathcal{Q}$ denotes a $\sigma$-algebra of subsets of $Q$, and $\psi$ is a probability measure on $\mathcal{Q}$. A quantizer ensemble
is independent of the source. It can be seen that $Q$ consists of quantizers within the classic definition.
For each use of the quantizer ensemble, a quantizer in $Q$ is randomly selected according to $\psi$ to perform
the quantization. We note that a conventional quantizer is a special quantizer ensemble.

A quantizer ensemble is a stochastic code, incurring a conditional probability measure

$$\phi(a, \hat{G}) = \psi\{q : q(a) \in \hat{G}\} \tag{3}$$

for any $a \in A$ and $\hat{G} \in \hat{\mathcal{A}}$. The probability measure of the reconstruction of a quantizer ensemble can
then be described by (1) and (2). The randomness of a quantizer ensemble allows flexibility in the
statistical properties of the reconstruction.

Although the quantization is a deterministic operation given the selected quantizer, observers outside
the ensemble have no knowledge about the selection, so they perceive it as operating stochastically. Figure 1
illustrates a typical source coding scenario, where the quantizer is split into an encoder and a decoder,
both of which can utilize some randomness that is unknown to the observer. Synchronicity between
the encoder and the decoder, if needed, can be achieved by using pseudo-random number generation.
We note that such a synchronization mechanism is not always needed. It is true that the encoder and the
decoder should jointly behave as a stochastic code, but either of them can be deterministic. It is also
possible that they both are stochastic, but have independent randomness.

Fig. 1. A typical source coding scenario with a quantizer ensemble.

Based on the notion of a quantizer ensemble, we can now define DPQ.

Definition 1 (DPQ): Distribution preserving quantization is a quantizer ensemble, for which the probability
space of the reconstruction is identical to that of the source.

As mentioned, an essential consideration for DPQ is the rate-distortion trade-off, as in conventional
quantization. The rate and the distortion are both well defined for conventional quantizers and hence for
the elements of a quantizer ensemble. We denote $D : Q \to [0, \infty)$ and $R : Q \to [0, \infty)$ as the distortion
and the rate of an individual quantizer in a quantizer ensemble. For a particular quantizer $q \in Q$, the
distortion $D(q)$ is the expectation of a distortion measure $e : A \times \hat{A} \to [0, \infty)$, i.e.,

$$D(q) = \int_A e(a, q(a))\, d\mu(a). \tag{4}$$

All individual quantizers in a quantizer ensemble share the same distortion measure, and the distortion of
the quantizer ensemble for that distortion measure is defined as the expected distortion of an individual
quantizer, i.e.,

$$D = \int_Q D(q)\, d\psi(q) = \int_A \int_Q e(a, q(a))\, d\psi(q)\, d\mu(a) = \int_A \int_{\hat{A}} e(a, b)\, d\phi(a, b)\, d\mu(a) = \int_{A \times \hat{A}} e(a, b)\, d\rho(a, b). \tag{5}$$

In addition, each individual quantizer is associated with a uniquely decodable code. The expected
codeword length defines its rate. The rate of the quantizer ensemble is defined as the expected rate
of an individual quantizer, i.e.,

$$R = \int_Q R(q)\, d\psi(q). \tag{6}$$

In the following, we give two examples of DPQ.

A. Simple Example of DPQ

A trivial DPQ scheme is to generate reconstructions according to the probability measure of the
source but statistically independently of the source. To describe it formally, we let $Q$ be all single-image
mappings: $Q = \{q : q(a) = q(b), \forall a, b \in A\}$ and

$$\psi\{q : q(a) \in \hat{G}\} = \mu(\hat{G}) \tag{7}$$

for any $a \in A$ and $\hat{G} \in \hat{\mathcal{A}}$. Applying (1), (2) and (3), we can verify the identity of the source and the
reconstruction in terms of the probability structure, and the independence between them, by

$$\rho(G, \hat{G}) = \int_G \psi\{q : q(a) \in \hat{G}\}\, d\mu(a) = \int_G \mu(\hat{G})\, d\mu(a) = \mu(G)\, \mu(\hat{G}). \tag{8}$$
Obviously, the rate of this DPQ scheme can be made zero. An interesting fact about this scheme is
that it describes the principle of synthesis-based reconstruction. Provided a source model, such a method
generates reconstructions without the need for additional information about any particular realization of
the source. The model may involve some transmission but is assumed in the context of rate-distortion
theory as a priori knowledge about the source probability space.

A pitfall of this simple DPQ scheme is that it leads to a fixed distortion. It is desirable that the distortion 
decreases when the rate for describing a particular source realization increases. 
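
As a minimal numerical sketch of this fixed distortion, assume a scalar Gaussian source with variance $\sigma^2$ and the squared error as the distortion measure. Both choices are illustrative assumptions, and the function name `synthesis_dpq_mse` is hypothetical. The reconstruction is drawn from the source model independently of the source, so the expected squared error is the sum of the two variances, $2\sigma^2$:

```python
import random
from statistics import fmean


def synthesis_dpq_mse(sigma=1.0, mu=0.0, n=100_000, seed=0):
    """Estimate the MSE of rate-zero synthesis for a Gaussian source model."""
    rng = random.Random(seed)
    sq_err = []
    for _ in range(n):
        x = rng.gauss(mu, sigma)      # source realization
        x_hat = rng.gauss(mu, sigma)  # independent draw from the same model
        sq_err.append((x - x_hat) ** 2)
    return fmean(sq_err)


mse = synthesis_dpq_mse(sigma=1.0)
print(f"empirical MSE = {mse:.3f}, expected about 2 * sigma^2 = 2.0")
```

Because the reconstruction ignores the source realization, only the source variance affects the estimated distortion in this sketch.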

B. DPQ Derived from Any Quantizer 

Another implementation of DPQ can be obtained by extending any conventional quantizer $q_0$, which
maps $A$ to a countable subset $A_0 \subset A$, to a quantizer ensemble and assigning a proper measure to it.
Specifically, given an output of $q_0$, the quantizer reconstructs the source by randomly sampling among
the values that can result in the same output, according to the relative probabilities of these values. A
special case of this methodology can be found in [13]. In the language of the quantizer ensemble, this
DPQ consists of $Q = \{q : q_0(q(a)) = q_0(a), \forall a \in A\}$. Defining $q_0^{-1}(\hat{a}) = \{a : q_0(a) = \hat{a}, a \in A\}$ for
$\hat{a} \in A_0$, we can write the probability measure of the quantizer ensemble as

$$\psi\{q : q(a) \in \hat{G}\} = \frac{\mu\{q_0^{-1}(q_0(a)) \cap \hat{G}\}}{\mu\{q_0^{-1}(q_0(a))\}} \tag{9}$$

for any $a \in A$ and $\hat{G} \in \hat{\mathcal{A}}$. We verify the probability space preservation of this scheme by showing

$$\hat{\mu}(\hat{G}) = \rho(A, \hat{G}) = \int_A \psi\{q : q(a) \in \hat{G}\}\, d\mu(a) = \sum_{\hat{a} \in A_0,\ \mu\{q_0^{-1}(\hat{a})\} \neq 0} \mu\{q_0^{-1}(\hat{a}) \cap \hat{G}\} = \mu(\hat{G}). \tag{10}$$

With this DPQ scheme, one can trade off the rate against the distortion. However, heuristically,
the distortion of this scheme can be relatively large. In terms of the mean
squared error (MSE), this DPQ loses 3 dB against $q_0$ [13] at the same rate.
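
The 3 dB loss can be illustrated with a small simulation. The sketch below assumes a standard Gaussian source and takes $q_0$ to be a uniform scalar quantizer whose reconstruction points are the cell centroids; these choices, the step size, and all function names are assumptions made for illustration and are not taken from [13]. Under them, the resampled reconstruction and the source are conditionally i.i.d. within a cell, so the MSE is twice the within-cell variance, i.e., twice the MSE of $q_0$:

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # assumed source model: zero mean, unit variance


def cell_bounds(x, step):
    """Return the uniform-quantizer cell [lo, hi) that contains x."""
    lo = math.floor(x / step + 0.5) * step - step / 2
    return lo, lo + step


def centroid(lo, hi):
    """Conditional mean of the standard normal source within [lo, hi)."""
    mass = STD_NORMAL.cdf(hi) - STD_NORMAL.cdf(lo)
    return (STD_NORMAL.pdf(lo) - STD_NORMAL.pdf(hi)) / mass


def resample_in_cell(lo, hi, rng):
    """Draw from the source distribution restricted to [lo, hi)."""
    p_lo, p_hi = STD_NORMAL.cdf(lo), STD_NORMAL.cdf(hi)
    u = p_lo + rng.random() * (p_hi - p_lo)
    u = min(max(u, 1e-15), 1 - 1e-15)  # keep inv_cdf strictly inside (0, 1)
    return STD_NORMAL.inv_cdf(u)


def simulate(step=1.0, n=50_000, seed=0):
    """Return (MSE of q0, MSE of the derived DPQ, second moment of the DPQ output)."""
    rng = random.Random(seed)
    err_q0, err_dpq, recon = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        lo, hi = cell_bounds(x, step)
        x_dpq = resample_in_cell(lo, hi, rng)
        err_q0.append((x - centroid(lo, hi)) ** 2)
        err_dpq.append((x - x_dpq) ** 2)
        recon.append(x_dpq)
    return fmean(err_q0), fmean(err_dpq), fmean(r * r for r in recon)


mse_q0, mse_dpq, second_moment = simulate()
print(f"MSE ratio in dB: {10 * math.log10(mse_dpq / mse_q0):.2f}")
print(f"second moment of the reconstruction: {second_moment:.3f}")
```

The second moment of the reconstruction stays close to the source variance, which reflects the preservation of the distribution, while the distortion doubles relative to $q_0$.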

In [14], a DPQ that achieves a better performance was proposed. An obvious question is: what is the
optimal trade-off between the rate and the distortion for DPQ, and how can it be achieved? A large part of this
article is devoted to answering these questions. Before moving to this discussion, we define the scope of
this article.

C. Scope of This Article 

The definition of DPQ given earlier is based on the probability space, which makes it suitable for
sources with an abstract alphabet. It is possible to follow such a notion for further discussion of DPQ,
similarly to the treatment of [18] and (£l Chapter 7]. However, in the context of practical quantization, the
sample space of the source mostly refers to the $k$-dimensional Euclidean space $\mathbb{R}^k$ for some $k$, and the
source is then known as a random vector $X$, which consists of $k$ random variables (r.v.). The probability
measure of such a probability space can be fully described by a probability distribution function of the
random vector, $F_X$, which is also known as the cumulative distribution function (c.d.f.). Confined to the
language of random variables, we can define a quantizer ensemble as a bivariate function $\hat{X} = q(X \mid \Theta)$,
where $\hat{X}$ denotes the reconstruction and $\Theta$ is an auxiliary random vector that governs the selection of a
quantizer for a use of the ensemble. $\Theta$ is independent of $X$. The stochastic code incurred by such
a quantizer ensemble can be described by a conditional probability distribution function $F_{\hat{X}|X}$.

On quantizing a sequence of random vectors, DPQ needs to preserve the joint probability distribution 
of the entire sequence. This article mainly deals with the DPQ for sequences that are comprised of 
independently and identically distributed (i.i.d.) random vectors. For such a source, DPQ aims to preserve 
the marginal probability distribution of each random vector and the independence among the random 
vectors. We note that, if the marginal probability distribution of a random vector in the sequence is 
preserved by one use of a quantizer ensemble, and the uses of the quantizer ensemble on different 
random vectors are independent, the quantizer ensemble is a DPQ for the entire sequence. 

III. Distribution Preserving Rate-Distortion Function 

The RDF plays an important role in lossy source coding. It gives a guideline of the minimum rate 
that any quantizer can achieve, subject to a constraint on the distortion between the source and its 
reconstruction. Here we define a similar function for DPQ. The function is referred to as the distribution 
preserving rate-distortion function (DP-RDF). It serves as a lower bound of the achievable rate of any 
DPQ scheme under a constrained distortion. 

Definition 2 (DP-RDF): The distribution preserving rate-distortion function for a probability distribution
$F_X$ and a distortion measure $e$ is defined as

$$R_{\mathrm{DP}}(D) = \inf_{F_{\hat{X}|X} \in \mathcal{F}(D)} I(X; \hat{X}), \tag{11}$$

where

$$\mathcal{F}(D) = \left\{ F_{\hat{X}|X} : \mathrm{E}\{e(X, \hat{X})\} \le D,\ F_{\hat{X}}(x) = F_X(x)\ \forall x \right\}. \tag{12}$$



Next we show that the DP-RDF is a lower bound for DPQ. To get there, we first show the following lemma.

Lemma 1: The DP-RDF is a non-increasing convex function.

Proof: To prove that the DP-RDF is non-increasing, we consider any $0 \le D_2 \le D_1$. It follows that
$\mathcal{F}(D_2) \subseteq \mathcal{F}(D_1)$ and therefore $R_{\mathrm{DP}}(D_2) \ge R_{\mathrm{DP}}(D_1)$.

To prove the convexity, we assume that a conditional probability distribution $F_1$ achieves $R_{\mathrm{DP}}(D_1)$ and $F_2$ achieves
$R_{\mathrm{DP}}(D_2)$. For any $0 \le \lambda \le 1$, let $D = \lambda D_1 + (1 - \lambda) D_2$. We consider the conditional probability distribution
$F = \lambda F_1 + (1 - \lambda) F_2$. It can be seen that $F \in \mathcal{F}(D)$, so with $X$ and $\hat{X}$ induced by $F$, $R_{\mathrm{DP}}(D) \le I(X; \hat{X})$.
Since the mutual information is a convex function of the conditional probability function (see, e.g., [19,
Theorem 2.7.4]), we also find that $I(X; \hat{X}) \le \lambda R_{\mathrm{DP}}(D_1) + (1 - \lambda) R_{\mathrm{DP}}(D_2)$. So the DP-RDF is a
convex function. ■

Then we show that the DP-RDF serves as a lower bound for DPQ in the following proposition.

Proposition 1 (DP-RDF is a lower bound for DPQ): Consider $\kappa$ i.i.d. random vectors $X_1, \cdots, X_\kappa$
(denoted as $X^\kappa$), each of which follows the probability distribution $F_X$. Given any DPQ scheme $\hat{X}^\kappa =
q(X^\kappa \mid \Theta)$, we consider a single-letter fidelity criterion $e_\kappa$, which is derived from a distortion measure $e$
as

$$e_\kappa(X^\kappa, \hat{X}^\kappa) = \frac{1}{\kappa} \sum_{i=1}^{\kappa} e(X_i, \hat{X}_i). \tag{13}$$

If the expected fidelity of the DPQ satisfies $\mathrm{E}\{e_\kappa(X^\kappa, \hat{X}^\kappa)\} \le D$, the per-dimension rate of the DPQ must
be greater than or equal to the DP-RDF for $F_X$ and $e$.
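Proof: The proof is analogous to the proof of the converse of the source coding theorem (see, e.g., [19,
Chapter 10.4]). The critical steps of the proof are the following:

\begin{align}
R &\ge \kappa^{-1} H(\hat{X}^\kappa \mid \Theta) \tag{14} \\
&= \kappa^{-1} \left[ H(\hat{X}^\kappa \mid \Theta) - H(\hat{X}^\kappa \mid X^\kappa, \Theta) \right] \notag \\
&= \kappa^{-1} \left[ h(X^\kappa \mid \Theta) - h(X^\kappa \mid \hat{X}^\kappa, \Theta) \right] \notag \\
&\ge \kappa^{-1} \left[ h(X^\kappa) - h(X^\kappa \mid \hat{X}^\kappa) \right] \notag \\
&= \kappa^{-1} I(X^\kappa; \hat{X}^\kappa) \notag \\
&\ge \kappa^{-1} \sum_{i=1}^{\kappa} I(X_i; \hat{X}_i) \notag \\
&\ge \kappa^{-1} \sum_{i=1}^{\kappa} R_{\mathrm{DP}}\left( \mathrm{E}\{e(X_i, \hat{X}_i)\} \right) \tag{15} \\
&\ge R_{\mathrm{DP}}\left( \kappa^{-1} \sum_{i=1}^{\kappa} \mathrm{E}\{e(X_i, \hat{X}_i)\} \right) \tag{16} \\
&\ge R_{\mathrm{DP}}(D), \tag{17}
\end{align}

where (14) is due to the fact that the rate of each individual quantizer in a quantizer ensemble is greater
than or equal to the entropy of its reconstruction; (15) holds because DPQ preserves the joint probability
distribution of $X^\kappa$ and hence must preserve the marginal probability distribution of each random vector,
so $I(X_i; \hat{X}_i)$ must be bounded from below by the DP-RDF; (16) is based on the convexity of the DP-RDF; and (17)
exploits the monotonicity of the DP-RDF. ■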
For a general source and distortion measure, whether the DP-RDF can be achieved by a DPQ scheme 
is an open problem. However, we will show in this paper that the DP-RDF for a Gaussian distribution 
and MSE is achievable. 

In the remainder of this section, we derive the DP-RDF for Gaussian distributions and MSE, then 
compare this DP-RDF to the corresponding RDF. We also try to link the DP-RDF to the RDF for general 
sources and distortion measures. 
A. DP-RDF for Gaussian Distributions and MSE

Similarly to the RDF, it is generally difficult to obtain the DP-RDF analytically. However, the Gaussian
source with MSE is one of the cases in which the DP-RDF can be derived. In lossy source coding, such a source
and distortion measure are usually of particular interest.

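Proposition 2: The DP-RDF for a Gaussian distribution and MSE is

$$R_{\mathrm{DP}}(D) = \begin{cases} \log \dfrac{\sigma_X^2}{\left(\sigma_X^2 D - D^2/4\right)^{1/2}}, & D < 2\sigma_X^2 \\ 0, & D \ge 2\sigma_X^2 \end{cases} \tag{18}$$

where $\sigma_X^2$ represents the variance of the Gaussian distribution.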

Proof: Let $\hat{X}$ be an r.v. that follows the same probability distribution as $X$. Since the mean and the
variance of $\hat{X}$ equal those of $X$, we find

\begin{align}
D &= \mathrm{E}\{(X - \hat{X})^2\} \notag \\
&= \mathrm{E}\{(X - \mu_X)^2\} + \mathrm{E}\{(\hat{X} - \mu_X)^2\} - 2\mathrm{E}\{(X - \mu_X)(\hat{X} - \mu_X)\} \notag \\
&= 2\sigma_X^2 - 2\mathrm{E}\{(X - \mu_X)(\hat{X} - \mu_X)\}, \tag{19}
\end{align}

where $\mu_X$ denotes the mean of $X$. Therefore the covariance matrix of the joint random vector $[X, \hat{X}]^T$
is

$$C = \begin{bmatrix} \sigma_X^2 & \sigma_X^2 - D/2 \\ \sigma_X^2 - D/2 & \sigma_X^2 \end{bmatrix}. \tag{20}$$

This matrix is positive semi-definite if and only if $0 \le D \le 4\sigma_X^2$. Knowing the covariance matrix, the
differential entropy of a random vector is upper bounded [19, Theorem 8.6.5]. Using this property and
the fact that the differential entropy of $\hat{X}$ equals that of $X$, we obtain

\begin{align}
I(X; \hat{X}) &= h(X) + h(\hat{X}) - h(X, \hat{X}) \notag \\
&\ge \log\left(2\pi e \sigma_X^2\right) - \log\left(2\pi e \det(C)^{1/2}\right) \notag \\
&= \log \sigma_X^2 - \log\left(\sigma_X^2 D - D^2/4\right)^{1/2}. \tag{21}
\end{align}

The equality is achieved when $[X, \hat{X}]^T$ is jointly Gaussian distributed. It is easy to show that this
condition is fulfilled without violating the preservation of the source probability distribution. We hence
have verified (18) for $D < 2\sigma_X^2$, where the right-hand side of (21) is decreasing in $D$. Then using the
non-increasing property of the DP-RDF, we can verify it for $D \ge 2\sigma_X^2$. ■



Fig. 2. The RDF and the DP-RDF for the standard Gaussian distribution and MSE. 



We compare the DP-RDF to the RDF for the same Gaussian distribution and MSE, which is

$$R(D) = \begin{cases} \dfrac{1}{2} \log \dfrac{\sigma_X^2}{D}, & D < \sigma_X^2 \\ 0, & D \ge \sigma_X^2. \end{cases} \tag{22}$$

Figure 2 illustrates both functions. It can be seen that the minimum MSE of the DP-RDF is $2\sigma_X^2$ when
the rate is zero. This is an achievable rate-distortion pair for DPQ, and the simple DPQ scheme in
Section II-A naturally achieves it. For the RDF, the minimum MSE at zero rate is $\sigma_X^2$, half of that in
the case of the DP-RDF. The gap between the DP-RDF and the RDF is simply because DPQ randomly
generates a reconstruction according to the source probability distribution, while an MSE-optimized
quantizer outputs the mean of the source, when the rate is zero. In general, the requirement of probability
distribution preservation increases the distortion. This loss, however, can vanish at high rates. For a
Gaussian distribution and MSE, we see that

$$\lim_{D \to 0} \left[ R_{\mathrm{DP}}(D) - R(D) \right] = \lim_{D \to 0} \frac{1}{2} \log \frac{\sigma_X^2}{\sigma_X^2 - D/4} = 0. \tag{23}$$

The behavior of the DP-RDF at low rates and high rates implies that the optimal DPQ forms a transition 
between synthesis and conventional quantization. 
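
The two closed forms can be compared numerically. The sketch below evaluates the DP-RDF (18) as $\frac{1}{2}\log\left(\sigma_X^4 / (\sigma_X^2 D - D^2/4)\right)$ for $D < 2\sigma_X^2$ and the RDF (22) as $\frac{1}{2}\log(\sigma_X^2 / D)$ for $D < \sigma_X^2$, both in nats; the function names are hypothetical and $D$ is assumed to be strictly positive:

```python
import math


def rdf_gaussian(D, var=1.0):
    """Classic RDF for a Gaussian source and MSE, in nats (requires D > 0)."""
    return 0.5 * math.log(var / D) if D < var else 0.0


def dp_rdf_gaussian(D, var=1.0):
    """DP-RDF for a Gaussian source and MSE, in nats (requires D > 0)."""
    if D >= 2.0 * var:
        return 0.0
    return 0.5 * math.log(var ** 2 / (var * D - D ** 2 / 4.0))


for D in (1.5, 1.0, 0.1, 0.001):
    r_dp, r = dp_rdf_gaussian(D), rdf_gaussian(D)
    print(f"D = {D:<6} R_DP = {r_dp:.4f}  R = {r:.4f}  gap = {r_dp - r:.4f}")
```

For $D < \sigma_X^2$ the printed gap equals $\frac{1}{2}\log\left(\sigma_X^2 / (\sigma_X^2 - D/4)\right)$, which shrinks as $D$ decreases, in line with (23).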

Proposition 2 also leads to a conceptually optimal construction of DPQ for a Gaussian r.v. and MSE,
which is given by the following corollary.

Corollary 1: For a Gaussian r.v. $X$ with mean $\mu_X$ and variance $\sigma_X^2$, consider another Gaussian r.v. $N$
that is independent of $X$ and has zero mean and variance $\sigma_N^2$. The following r.v.,

$$\hat{X} = \left( \frac{\sigma_X^2}{\sigma_X^2 + \sigma_N^2} \right)^{1/2} (X - \mu_X + N) + \mu_X, \tag{24}$$

has the same probability distribution as $X$, and the mutual information between $X$ and $\hat{X}$ achieves the
DP-RDF for $X$ at the MSE between $X$ and $\hat{X}$.
Proof: First, given the fact that $N$ is independent of $X$ and is Gaussian distributed with zero mean
and variance $\sigma_N^2$, it is clear that $\hat{X}$ follows the same Gaussian distribution as $X$. Then with simple
algebra, one can obtain the MSE between $X$ and $\hat{X}$ as

$$D = \mathrm{E}\{(X - \hat{X})^2\} = 2\sigma_X^2 \left( 1 - \left( \frac{\sigma_X^2}{\sigma_X^2 + \sigma_N^2} \right)^{1/2} \right). \tag{25}$$

Finally, the mutual information between $X$ and $\hat{X}$ can be written as

$$R = I(X; \hat{X}) = I(X; X + N) = \frac{1}{2} \log \frac{\sigma_X^2 + \sigma_N^2}{\sigma_N^2} = \log \frac{\sigma_X^2}{\left(\sigma_X^2 D - D^2/4\right)^{1/2}}. \tag{26}$$

Comparing (26) to Proposition 2, we verify Corollary 1. ■
Corollary 1 indicates that if there is a quantizer that operates like an additive white Gaussian noise
(AWGN) channel and has a rate equal to the capacity of the channel, an optimal DPQ for a Gaussian
r.v. and MSE is such a quantizer followed by a shifting and a scaling. It is known that entropy coded
dithered lattice quantization (ECDQ) behaves effectively as a channel with additive noise and the rate
equals the channel capacity [20]. Unfortunately, when ECDQ has finite dimensionality, the quantization
noise is not Gaussian. However, a DPQ scheme can be obtained by applying a non-linear transformation
after an ECDQ. This approach will be discussed later in this article.
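
The construction in Corollary 1 can be checked with a short simulation. The sketch below idealizes the quantizer as an AWGN channel, which is an assumption and not a practical coder; the function name and the parameter values are hypothetical. It compares the empirical MSE with the value $2\sigma_X^2(1 - c)$, where $c$ is the scaling factor in (24), and compares the AWGN capacity with the Gaussian DP-RDF evaluated at that MSE:

```python
import math
import random
from statistics import fmean


def corollary1_check(var_x=1.0, var_n=0.5, mean_x=0.0, n=100_000, seed=0):
    """Simulate the scaled AWGN construction and compare it with the closed forms."""
    rng = random.Random(seed)
    scale = math.sqrt(var_x / (var_x + var_n))
    sq_err, recon = [], []
    for _ in range(n):
        x = rng.gauss(mean_x, math.sqrt(var_x))
        noise = rng.gauss(0.0, math.sqrt(var_n))       # idealized AWGN channel
        x_hat = scale * (x - mean_x + noise) + mean_x  # shift and scale
        sq_err.append((x - x_hat) ** 2)
        recon.append(x_hat)
    emp_mse = fmean(sq_err)
    emp_var = fmean((r - mean_x) ** 2 for r in recon)
    theo_mse = 2.0 * var_x * (1.0 - scale)             # MSE implied by the construction
    rate = 0.5 * math.log((var_x + var_n) / var_n)     # AWGN capacity in nats
    dp_rdf = 0.5 * math.log(var_x ** 2 / (var_x * theo_mse - theo_mse ** 2 / 4.0))
    return emp_mse, theo_mse, emp_var, rate, dp_rdf


emp_mse, theo_mse, emp_var, rate, dp_rdf = corollary1_check()
print(f"MSE: empirical {emp_mse:.4f}, closed form {theo_mse:.4f}")
print(f"variance of reconstruction: {emp_var:.4f}")
print(f"rate {rate:.4f} nats, DP-RDF at that MSE {dp_rdf:.4f} nats")
```

Varying `var_n` sweeps the operating point along the DP-RDF curve: a large noise variance approaches the zero-rate synthesis case, and a small one approaches conventional quantization.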

We have compared the DP-RDF and the RDF for a Gaussian distribution and MSE. We now try to 
analyze their relationship for more general sources and distortion measures. 

B. Relationship between DP-RDF and RDF 

The relationship between the RDF and the DP-RDF is usually not as straightforward as in the case of 
Gaussian distributions and MSE. However, we will show that for a broad class of sources and distortion 
measures, the DP-RDF approaches the corresponding RDF when the rate increases. 

It is known that the RDF equals the Shannon lower bound (SLB), when the source and its reconstruction 
are related by a "backward channel" with additive noise Q. From the reconstruction that achieves the 
SLB, we construct a "forward channel" with the same noise statistics as for the "backward channel". Then 
the output of the forward channel follows the probability distribution of the source and hence defines 
an upper bound on the DP-RDF. This upper bound can be related to the SLB. Figure 3 illustrates the
relationship of a source $X$, an SLB-achieving reconstruction $\hat{X}$, a distribution preserving output $\tilde{X}$, and
the noise of a backward and a forward channel, denoted by $W$ and $\tilde{W}$, respectively.

The SLB is defined for a difference distortion measure as

$$R_{\mathrm{SLB}}(D_W) = h(X) - \sup_{W : \mathrm{E}\{e(W)\} \le D_W} h(W). \tag{27}$$
Fig. 3. A backward-forward channel that preserves the probability distribution of the source. In this model, $\hat{X}$ is independent
of $W$ and $\tilde{W}$. The probability distribution of $\tilde{W}$ equals that of $W$.



When the distortion measure satisfies $e(w) = e(-w)$, the p.d.f. of $W$ is symmetric. Then letting $\tilde{W} = -W$
suffices for the backward-forward channel in Figure 3 to preserve the source probability distribution. In
this case, the mutual information between $X$ and $\tilde{X}$ follows

\begin{align}
I(X; \tilde{X}) &= h(X) + h(\tilde{X}) - h(X, \tilde{X}) \notag \\
&= 2h(X) - h(X - \tilde{X}, \tilde{X}) \notag \\
&= 2h(X) - h(X - \tilde{X}) - h(\tilde{X} \mid X - \tilde{X}) \notag \\
&= 2h(X) - h(2W) - h(X - W), \tag{28}
\end{align}

where (28) stems from the fact that $W$ and $X - W$ are independent. The distortion between $X$ and $\tilde{X}$ is

$$D = \mathrm{E}\{e(X - \tilde{X})\} = \mathrm{E}\{e(2W)\}. \tag{29}$$

This rate-distortion characteristic forms an upper bound on the DP-RDF. To relate it to the SLB (27),
we need to investigate the effect of scaling the noise on the SLB. We consider the following lemma.
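Lemma 2: If a difference distortion measure $e$ satisfies

$$e(aw) = c(a)\, e(w), \quad a > 0, \tag{30}$$

the SLB for any source and $e$ satisfies

$$R_{\mathrm{SLB}}(c(a) D) = R_{\mathrm{SLB}}(D) - \log a. \tag{31}$$

Proof: This lemma can be simply proven by

\begin{align}
R_{\mathrm{SLB}}(D) &= h(X) - \sup_{\mathrm{E}\{e(W)\} \le D} h(W) \notag \\
&= h(X) - \sup_{\mathrm{E}\{e(aW)\} \le c(a) D} h(aW) + \log a \notag \\
&= R_{\mathrm{SLB}}(c(a) D) + \log a. \tag{32}
\end{align}

■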
Now we can provide a relationship between the SLB and the DP-RDF.

Proposition 3 (relationship between SLB and DP-RDF): For a source $X$ and a difference distortion
measure $e$, if

1) the distortion measure satisfies $e(2w) = c\, e(w)$ and $e(w) = e(-w)$, $\forall w$, and

2) the SLB is tight, and $W$ is the reconstruction error that achieves $R_{\mathrm{SLB}}(D/c)$,

then the DP-RDF for $X$ and $e$ is bounded by

$$R_{\mathrm{SLB}}(D) \le R_{\mathrm{DP}}(D) \le R_{\mathrm{SLB}}(D) + h(X) - h(X - W). \tag{33}$$

Proof: The left inequality is trivial: the DP-RDF is larger than or equal to the corresponding RDF,
which is larger than or equal to the SLB.

To prove the right inequality, we apply the upper bound of the DP-RDF given by (28) and (29). Using
Lemma 2, we find

\begin{align}
R_{\mathrm{DP}}(D) &\le 2h(X) - h(2W) - h(X - W) \notag \\
&= R_{\mathrm{SLB}}(D/c) - \log 2 + h(X) - h(X - W) \notag \\
&= R_{\mathrm{SLB}}(D) + h(X) - h(X - W). \tag{34}
\end{align}

■

Since the p.d.f. of $W$ becomes narrower as the distortion approaches 0, $h(X - W)$ can get closer to
$h(X)$, so that $R_{\mathrm{DP}}(D)$ may approach $R_{\mathrm{SLB}}(D)$ and hence also the RDF. A Gaussian source with
MSE is an example of this situation. For rigorous conditions under which $h(X) - h(X - W) \to 0$, one may refer
to [21], which also proves that the SLB is asymptotically tight under mild assumptions as the distortion
decreases. This implies that the DP-RDF is asymptotically equivalent to the RDF for a large range of
sources and distortion measures.
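As a numerical illustration of (33), the following sketch evaluates both sides of the bound for a Gaussian source with MSE. It is only a sketch under stated assumptions: rates are in nats (natural logarithm), $c = 4$ because $e(2w) = 4e(w)$ for the squared error, the SLB is taken to be tight for $0 < D \le \sigma_X^2$, and the function names are hypothetical.

```python
import math


def slb_gaussian_mse(var_x, d):
    """Shannon lower bound for a Gaussian source under MSE, in nats.

    For 0 < d <= var_x the bound is tight and equals the classic RDF.
    """
    if not 0.0 < d <= var_x:
        raise ValueError("need 0 < d <= var_x")
    return 0.5 * math.log(var_x / d)


def entropy_gap(var_x, d, c=4.0):
    """h(X) - h(X - W) in (33), assuming W ~ N(0, d / c) independent of X - W."""
    var_w = d / c  # variance of the error that achieves R_SLB(d / c)
    # X and X - W are both Gaussian, so the gap is half the log variance ratio.
    return 0.5 * math.log(var_x / (var_x - var_w))


def dp_rdf_upper_bound(var_x, d):
    """Right-hand side of (33) for a Gaussian source and MSE."""
    return slb_gaussian_mse(var_x, d) + entropy_gap(var_x, d)


if __name__ == "__main__":
    for d in (1.0, 0.1, 0.01, 0.001):
        lower = slb_gaussian_mse(1.0, d)
        upper = dp_rdf_upper_bound(1.0, d)
        print(f"D={d:<6} SLB={lower:.4f} upper={upper:.4f} gap={upper - lower:.6f}")
```

The printed gap shrinks as $D$ decreases, which is the behavior described in the paragraph above.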

IV. Transformation-Based DPQ 

In [14], a scalar DPQ scheme that uses dithering and a non-linear transformation was proposed. It
is based on the fact that the source probability distribution can be preserved by
applying a transformation to the output of a dithered quantizer. We refer to such a DPQ paradigm as
transformation-based DPQ. Here we generalize the idea to vector DPQ and analyze the resulting
scheme in detail. The analysis shows that this scheme has favorable rate-distortion
properties. In particular, it asymptotically achieves the DP-RDF for Gaussian distributions and
MSE.

Fig. 4. Diagram of transformation-based DPQ.



A. Quantization Scheme 

A transformation-based DPQ, as shown in Figure 4, is a construction based on an ECDQ followed by
a transformation. The rate of the ECDQ is defined as the rate of the transformation-based DPQ. ECDQ
has a rate-distortion performance that is close to the RDF [20]; therefore, we can expect the
whole DPQ scheme to achieve a low distortion in a rate range where the transformation does not affect
the signal significantly.

Let the source $X$ be a $k$-dimensional random vector. The ECDQ uses a subtractive dither and a
$k$-dimensional lattice quantizer. The lattice quantizer performs the following operation:

$q_L(\lambda) = l_n, \quad \lambda \in \mathcal{P}_n,$

where $l_n$ and $\mathcal{P}_n$ represent the $n$-th lattice point and the $n$-th lattice cell, respectively. Every cell can be
defined as a translation of a basic cell $\mathcal{P}_0$:

$\mathcal{P}_n = l_n + \mathcal{P}_0.$

In the following, the volume of $\mathcal{P}_0$ is denoted as $V$, i.e., $V = \mathrm{Vol}(\mathcal{P}_0)$. The lattice quantizer is used
together with a subtractive dither. The dither $Z$ is generated according to the uniform distribution over the
basic cell $\mathcal{P}_0$. It is added to $X$ before the lattice quantization and subtracted from the quantized signal,
resulting in $\tilde{X}$. Finally, a transformation $g$ is applied to $\tilde{X}$, yielding a reconstruction of the source block,
$\hat{X}$. The whole DPQ operation can be written as a bivariate function with the dither being an auxiliary
variable: $\hat{X} = g(q_L(X + Z) - Z)$.

With the ECDQ, $N = \tilde{X} - X$ is independent of $X$ and uniformly distributed over $-\mathcal{P}_0$ [22]. Therefore,
$\tilde{X}$ follows a continuous probability distribution, which can be calculated analytically.

For a one-to-one mapping $\hat{X} = g(\tilde{X})$, the p.d.f. of $\hat{X}$ becomes

$f_{\hat{X}}(\hat{x}) = f_{\tilde{X}}(g^{-1}(\hat{x}))\, |\det(J(\hat{x}))|,$ (35)

where $J(\hat{x})$ is the Jacobian of $g^{-1}(\hat{x})$. Thus a sufficient and necessary condition for the scheme in Figure
4 to be a DPQ is

$|\det(J(\hat{x}))| = \frac{f_X(\hat{x})}{f_{\tilde{X}}(g^{-1}(\hat{x}))}, \quad \text{a.e.}$ (36)

Except for the scalar case, it can be difficult to find a transformation that fulfills this condition. An 
existing method is Rosenblatt's transformation [23], which performs a sequence of transformations on 
a number of continuous r.v.'s to obtain independent r.v.'s that are uniformly distributed over [0,1]. An 
inverse Rosenblatt's transformation can transform independent r.v.'s that are uniformly distributed over 
[0, 1] to r.v.'s with an arbitrary probability distribution, which is known in the field of random number 
generation as inverse transform sampling. 

In the following, we use $X_I$ to denote a random vector that consists of a subset of the r.v.'s of another
random vector $X$, where $I$ is the set that contains all chosen indices. The cardinality of $I$ is denoted as
$|I|$. In addition, we use $F_{X|Y}$ and $f_{X|Y}$ to denote the conditional c.d.f. and p.d.f. of a random vector $X$
given another random vector $Y$.

Rosenblatt's transformation performs the following operations sequentially:

$U_1 = F_{X_1}(X_1)$
$\quad \vdots$
$U_k = F_{X_k | X_{\{1, \dots, k-1\}}}(X_k | X_{\{1, \dots, k-1\}})$ (37)

The result of the transformation is that $U_1, \dots, U_k$ are independently and uniformly distributed over
$[0, 1]$. We then perform an inverse Rosenblatt's transformation as follows:

$X_1 = F_{X_1}^{-1}(U_1)$
$\quad \vdots$
$X_k = F_{X_k | X_{\{1, \dots, k-1\}}}^{-1}(U_k | X_{\{1, \dots, k-1\}})$ (38)

where the inverse c.d.f. follows a standard definition, i.e., $F_X^{-1}(u) = \inf\{x : F_X(x) \ge u\}$. In fact, for
the inverse transformation (38), reordering $U_1, \dots, U_k$ does not influence the probability distribution of
$\hat{X}$. However, we consider this particular order, since it yields a small Euclidean distance between $\tilde{X}$ and
$\hat{X}$. It will be shown that DPQ with this transformation leads asymptotically to the optimal rate-distortion
performance in certain circumstances.

Combining (37), applied to the ECDQ output $\tilde{X}$, with (38), we may write an overall transformation for the proposed DPQ scheme:

$g_i(\tilde{x}) = F_{X_i | X_{\{1, \dots, i-1\}}}^{-1}\!\left( F_{\tilde{X}_i | \tilde{X}_{\{1, \dots, i-1\}}}(\tilde{x}_i | \tilde{x}_{\{1, \dots, i-1\}}) \,\middle|\, g_1(\tilde{x}), \dots, g_{i-1}(\tilde{x}) \right), \quad i = 1, \dots, k.$ (39)

From the properties of Rosenblatt's transformation, we can see that the probability distribution of the
transformation output equals that of the source. This can also be verified by checking that (39) fulfills
(36). Therefore we can claim:

Proposition 4: The proposed scheme in Figure 4 with the transformation defined by (39) is a DPQ.
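To make the two-step structure of Rosenblatt's transformation concrete, the sketch below implements the forward and inverse maps for a bivariate Gaussian vector. The choice of a zero-mean, unit-variance pair with correlation `rho` is an assumption made only for this illustration, as are the function names; the conditional distribution of $X_2$ given $X_1 = x_1$ is then Gaussian with mean $\rho x_1$ and variance $1 - \rho^2$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def rosenblatt_forward(x1, x2, rho):
    """Map (x1, x2) to (u1, u2) for a standard bivariate Gaussian with correlation rho."""
    cond_std = math.sqrt(1.0 - rho * rho)               # std of X_2 given X_1
    u1 = STD_NORMAL.cdf(x1)                             # F_{X_1}(x_1)
    u2 = STD_NORMAL.cdf((x2 - rho * x1) / cond_std)     # F_{X_2|X_1}(x_2 | x_1)
    return u1, u2


def rosenblatt_inverse(u1, u2, rho):
    """Map (u1, u2) in (0, 1)^2 back to (x1, x2); inverts rosenblatt_forward."""
    cond_std = math.sqrt(1.0 - rho * rho)
    x1 = STD_NORMAL.inv_cdf(u1)                         # inverse of F_{X_1}
    x2 = rho * x1 + cond_std * STD_NORMAL.inv_cdf(u2)   # inverse of F_{X_2|X_1}(. | x1)
    return x1, x2


if __name__ == "__main__":
    u = rosenblatt_forward(0.3, -1.2, rho=0.6)
    print(u, rosenblatt_inverse(*u, rho=0.6))
```

In the DPQ setting, the forward map would use the conditional c.d.f.'s of the ECDQ output and the inverse map those of the source, which is what distinguishes (39) from a plain round trip.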

Here we briefly consider the behavior of the transformation-based DPQ at low rates and high rates, 
respectively. When the rate is low, the output of the ECDQ has a near-uniform probability distribution, 
which is reformed by the transformation to a desired shape. At high rates, the output of the dithered 
quantization has a probability distribution that resembles that of the source. Then the transformation 
modifies the ECDQ output only slightly. We will show later that, as the rate increases, the modification 
becomes so small that it does not increase the distortion that is introduced by the ECDQ. 
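The low-rate and high-rate behavior described above can be explored with a scalar ($k = 1$) sketch. It assumes a zero-mean Gaussian source with standard deviation `sigma` and a uniform quantizer with step size `step`, so that the basic cell is an interval of length `step`; entropy coding is omitted, and all names are hypothetical. The c.d.f. of the ECDQ output is the source c.d.f. averaged over one cell, which has a closed form because $u\Phi(u) + \phi(u)$ is an antiderivative of the standard normal c.d.f. $\Phi$.

```python
import math
import random
from statistics import NormalDist


def make_scalar_dpq(sigma, step):
    """Return (dpq, g) for a zero-mean Gaussian source with std sigma."""
    src = NormalDist(0.0, sigma)
    std = NormalDist()

    def antiderivative(u):
        # Antiderivative of the standard normal c.d.f.
        return u * std.cdf(u) + std.pdf(u)

    def cdf_ecdq(xt):
        # c.d.f. of the dithered-quantizer output: source c.d.f. averaged over one cell
        a = (xt - 0.5 * step) / sigma
        b = (xt + 0.5 * step) / sigma
        return (sigma / step) * (antiderivative(b) - antiderivative(a))

    def g(xt):
        # Scalar transformation: inverse source c.d.f. of the ECDQ-output c.d.f.
        p = min(max(cdf_ecdq(xt), 1e-12), 1.0 - 1e-12)  # keep inv_cdf in its domain
        return src.inv_cdf(p)

    def dpq(x, z):
        # Subtractive dither around a uniform quantizer, followed by g
        xt = step * math.floor((x + z) / step + 0.5) - z
        return g(xt)

    return dpq, g


def simulate(sigma=1.0, step=2.0, n=20000, seed=0):
    """Return sample mean and variance of the reconstruction, and the MSE."""
    rng = random.Random(seed)
    dpq, _ = make_scalar_dpq(sigma, step)
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    xh = [dpq(x, rng.uniform(-0.5 * step, 0.5 * step)) for x in xs]
    mean = sum(xh) / n
    var = sum((v - mean) ** 2 for v in xh) / n
    mse = sum((a - b) ** 2 for a, b in zip(xs, xh)) / n
    return mean, var, mse


if __name__ == "__main__":
    for step in (4.0, 2.0, 0.5, 0.1):
        print(step, simulate(step=step))
```

With a coarse step the reconstruction should keep a sample variance close to `sigma ** 2` even though the quantizer conveys little information, and with a fine step the MSE should approach `step ** 2 / 12`, the noise power of the dithered quantizer alone.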

In the following, we analyze transformation-based DPQ in detail. The analysis deals
with its general properties and its asymptotic properties w.r.t. high rates and high dimensionality, respectively.

B. Properties of Transformation-Based DPQ

We are interested in the amount of modification that the transformation (39) introduces. If $g(\tilde{x})$ always
falls in a vicinity of $\tilde{x}$, the transformation-based DPQ will have a rate-distortion performance similar to that of
ECDQ.

The transformation (39) is non-linear and seems difficult to analyze. However, it has
a special structure, i.e., it consists of an inner and an outer function that are closely related. Therefore,
some properties exist that facilitate an analysis of the transformation-based DPQ.

We first show a property of ECDQ using the following lemma. 

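Lemma 3: For a $k$-dimensional ECDQ with input $X$ and output $\tilde{X}$, given any realization $\tilde{x}$ of $\tilde{X}$, and
any two disjoint subsets $I$ and $J$ of $\{1, \dots, k\}$, there exists an

$x \in \tilde{x} + \mathcal{P}_0,$

such that

$F_{X_I | X_J}(x_I | x_J) = F_{\tilde{X}_I | \tilde{X}_J}(\tilde{x}_I | \tilde{x}_J).$

Proof: In ECDQ, the quantization noise, $\tilde{X} - X$, is independent of the input and uniformly distributed
over $-\mathcal{P}_0$. So the p.d.f. of $\tilde{X}$ is

$f_{\tilde{X}}(\tilde{x}) = \frac{1}{V} \int_{\mathcal{P}_0} f_X(\tilde{x} + \tau)\, d\tau.$ (40)

Let $K = \{1, \dots, k\}$. The marginal p.d.f. of $\tilde{X}_J$ is

$f_{\tilde{X}_J}(\tilde{x}_J) = \int_{\mathbb{R}^{k-|J|}} f_{\tilde{X}_J, \tilde{X}_{K \setminus J}}(\tilde{x}_J, v)\, dv$
$\quad = \frac{1}{V} \int_{\mathbb{R}^{k-|J|}} \int_{\mathcal{P}_0} f_{X_J, X_{K \setminus J}}(\tilde{x}_J + \tau_J, v + \tau_{K \setminus J})\, d\tau\, dv$
$\quad = \frac{1}{V} \int_{\mathcal{P}_0} \int_{\mathbb{R}^{k-|J|}} f_{X_J, X_{K \setminus J}}(\tilde{x}_J + \tau_J, v + \tau_{K \setminus J})\, dv\, d\tau$
$\quad = \frac{1}{V} \int_{\mathcal{P}_0} f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau.$ (41)

Then the conditional p.d.f. of $\tilde{X}_I$ given $\tilde{X}_J$ is

$f_{\tilde{X}_I | \tilde{X}_J}(\tilde{x}_I | \tilde{x}_J) = \frac{\int_{\mathcal{P}_0} f_{X_I, X_J}(\tilde{x}_I + \tau_I, \tilde{x}_J + \tau_J)\, d\tau}{\int_{\mathcal{P}_0} f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau}.$ (42)

The conditional c.d.f. can be derived as

$F_{\tilde{X}_I | \tilde{X}_J}(\tilde{x}_I | \tilde{x}_J) = \int_{-\infty}^{\tilde{x}_I} f_{\tilde{X}_I | \tilde{X}_J}(v | \tilde{x}_J)\, dv$
$\quad = \frac{\int_{-\infty}^{\tilde{x}_I} \int_{\mathcal{P}_0} f_{X_I, X_J}(v + \tau_I, \tilde{x}_J + \tau_J)\, d\tau\, dv}{\int_{\mathcal{P}_0} f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau}$
$\quad = \frac{\int_{\mathcal{P}_0} \int_{-\infty}^{\tilde{x}_I} f_{X_I, X_J}(v + \tau_I, \tilde{x}_J + \tau_J)\, dv\, d\tau}{\int_{\mathcal{P}_0} f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau}$
$\quad = \frac{\int_{\mathcal{P}_0} F_{X_I | X_J}(\tilde{x}_I + \tau_I | \tilde{x}_J + \tau_J)\, f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau}{\int_{\mathcal{P}_0} f_{X_J}(\tilde{x}_J + \tau_J)\, d\tau}.$ (43)

Because $F_{X_I | X_J}$ is a continuous function and $f_{X_J}$ is nonnegative, Lemma 3 follows from the mean value
theorem of integration. ■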

Lemma 3 implies that, for any $i$-th step of the transformation (39), if one is free to choose $g_1(\tilde{x}), \dots, g_{i-1}(\tilde{x})$,
the result of the transformation is almost surely bounded in the $\mathcal{P}_0$ vicinity of $\tilde{x}$. Unfortunately, due to the
sequential nature of the transformation, $g_1(\tilde{x}), \dots, g_{i-1}(\tilde{x})$ are fixed for the $i$-th step of (39), thus there
is no guarantee of a bound on the result of the transformation. However, when the source is composed
of independent r.v.'s, the influence of the sequential treatment is less severe. We find the following
proposition.






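Proposition 5: For a $k$-dimensional ECDQ with input $X$ and output $\tilde{X}$, if $X$ is composed of independent
r.v.'s, then given any realization $\tilde{x}$ of $\tilde{X}$, there exists an

$x \in \tilde{x} + T(\mathcal{P}_0),$

with $T(\mathcal{P}_0)$ defined as a box that covers the basic quantization cell:

$T(\mathcal{P}_0) = \left\{ v : \inf_{\tau \in \mathcal{P}_0} \tau_i \le v_i \le \sup_{\tau \in \mathcal{P}_0} \tau_i, \ i = 1, \dots, k \right\},$

such that

$F_{X_i}(x_i) = F_{\tilde{X}_i | \tilde{X}_{\{1, \dots, i-1\}}}(\tilde{x}_i | \tilde{x}_{\{1, \dots, i-1\}})$

holds for $i = 1, 2, \dots, k$, simultaneously.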

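Proof: According to Lemma 3 and the independence of the source r.v.'s, for any $\tilde{x}$ and $i$, there is an

$x^{(i)} \in \tilde{x} + \mathcal{P}_0,$ (44)

such that

$F_{X_i}(x_i^{(i)}) = F_{\tilde{X}_i | \tilde{X}_{\{1, \dots, i-1\}}}(\tilde{x}_i | \tilde{x}_{\{1, \dots, i-1\}}).$ (45)

It is easy to see that

$\inf_{\tau \in \mathcal{P}_0} \tau_i \le x_i^{(i)} - \tilde{x}_i \le \sup_{\tau \in \mathcal{P}_0} \tau_i.$ (46)

We take

$x = (x_1^{(1)}, \dots, x_k^{(k)}),$ (47)

which proves Proposition 5. ■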
Proposition 5 implies that, when the source is composed of independent r.v.'s, the transformation (39)
does not move its input far. So the transformation-based DPQ can have a rate-distortion
performance comparable to that of the embedded ECDQ. Proposition 5 also implies the robustness of transformation-based
DPQ, i.e., even if the probabilistic model does not match the input data well, the reconstruction of the
transformation-based DPQ still stays within a bounded distance of the ECDQ output.

However, when the dimensionality approaches infinity, Proposition 5 can become less meaningful,
since the covering box may become unbounded. For high dimensionality, the transformation has some
additional properties that will be considered in Section IV-D.

C. Asymptotic Properties w.r.t. High Rates

We have analyzed the transformation at any rate. The results show that, for a source of independent 
r.v.'s, the transformation performs a mild change to the ECDQ output. In the following, we further consider 
a high rate scenario, for which the MSE of transformation-based DPQ approaches that of ECDQ. 

Proposition 6: Let $X$ be a source random vector consisting of independent r.v.'s, each of which has a
c.d.f. and an inverse c.d.f. whose second derivatives are bounded almost everywhere. Assume the basic
cell $\mathcal{P}_0$ of the lattice used in the transformation-based DPQ is symmetric w.r.t. each of its coordinates,
meaning that if

$(\tau_1, \dots, \tau_k) \in \mathcal{P}_0,$ (48)

then, for any $1 \le i \le k$,

$(\tau_1, \dots, -\tau_i, \dots, \tau_k) \in \mathcal{P}_0.$ (49)

Then, the MSE of the transformation-based DPQ and that of the embedded ECDQ satisfy

$E\{\|X - \hat{X}\|^2\} = E\{\|X - \tilde{X}\|^2\} + O(V^{3/k}).$ (50)

The proof of Proposition 6 relies on Taylor series expansions and is given in the Appendix.

By Proposition 6, for independent r.v.'s and MSE, transformation-based DPQ becomes as
efficient as ECDQ as the rate increases. In addition, because the optimal ECDQ can asymptotically
achieve the RDF for an i.i.d. Gaussian source and MSE as the rate and the dimensionality increase [20],
the transformation-based DPQ can also asymptotically reach the RDF and hence the DP-RDF. Moreover,
it will be shown later that, for i.i.d. Gaussian sources and MSE, the transformation-based DPQ can
asymptotically reach the DP-RDF at any rate as the dimensionality increases. To get there, we will first
investigate the behavior of the transformation at high dimensionality.

D. Asymptotic Properties w.r.t. High Dimensionality 

In this subsection, we consider the asymptotic behavior of the transformation when the dimensionality
increases. In particular, with a certain sequence of lattices, the shape of the basic cells can approach a
ball, and the transformation (39) can become simpler, especially when the source r.v.'s are independent.

Let $\mathcal{B}_k(r)$ denote a $k$-dimensional ball with radius $r$, and $B_k(r)$ denote its volume:

$B_k(r) = \mathrm{Vol}(\mathcal{B}_k(r)) = \frac{\pi^{k/2} r^k}{\Gamma(\frac{k}{2} + 1)}.$ (51)

The following lemma considers the average of a function in a ball when the ball's dimensionality is
large. The lemma is inspired by Poincaré's observation that, if a $k$-dimensional random vector follows a
uniform distribution on a sphere (the surface of a ball), then any finite subset of the r.v.'s of the random
vector follows an i.i.d. Gaussian distribution, when $k \to \infty$. A proof of this can be found in [24]. We
here consider the average of a function in a ball, which should make no significant difference from its
average on the surface of the ball, since a thin shell located at the surface of a ball takes all the volume of
the ball when the dimensionality approaches infinity, which is known as sphere hardening. The statement
and proof of our lemma are different from the mentioned work and are shown below.

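Lemma 4: For any integer set $I$ with finite cardinality, any function $f : \mathbb{R}^{|I|} \to \mathbb{R}$ and any $\eta > 0$, the
following holds:

$\lim_{k \to \infty} \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta)} f(\tau_I)\, d\tau = \int_{\mathbb{R}^{|I|}} f(\tau)\, (2\pi\eta^2)^{-|I|/2} \exp\!\left(-\frac{\|\tau\|^2}{2\eta^2}\right) d\tau.$ (52)

Proof: Because the intersection of a ball with a hyper-plane is a ball of lower dimensionality, it can
be shown that

$\frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta)} f(\tau_I)\, d\tau = \int_{\|\tau_I\|^2 \le k\eta^2} f(\tau_I)\, \frac{B_{k-|I|}\!\left((k\eta^2 - \|\tau_I\|^2)^{1/2}\right)}{B_k(k^{1/2}\eta)}\, d\tau_I.$ (53)

Further, we find

$\frac{B_{k-|I|}\!\left((k\eta^2 - \|\tau_I\|^2)^{1/2}\right)}{B_k(k^{1/2}\eta)} = \frac{\Gamma(\frac{k}{2} + 1)\, \pi^{(k-|I|)/2}\, (k\eta^2 - \|\tau_I\|^2)^{(k-|I|)/2}}{\Gamma(\frac{k-|I|}{2} + 1)\, \pi^{k/2}\, (k^{1/2}\eta)^k}$
$\quad = (2\pi\eta^2)^{-|I|/2}\, \frac{\Gamma(\frac{k}{2} + 1)}{\Gamma(\frac{k-|I|}{2} + 1)\, (\frac{k}{2})^{|I|/2}} \left(1 - \frac{\|\tau_I\|^2}{k\eta^2}\right)^{(k-|I|)/2}.$ (54)

Using the property of the ratio of two Gamma functions (see, e.g., [25, Equation 6.1.46]), we can obtain

$\lim_{k \to \infty} \frac{\Gamma(\frac{k}{2} + 1)}{\Gamma(\frac{k-|I|}{2} + 1)\, (\frac{k}{2})^{|I|/2}} = 1.$ (55)

Also, it is easy to find that

$\lim_{k \to \infty} \left(1 - \frac{\|\tau_I\|^2}{k\eta^2}\right)^{(k-|I|)/2} = \exp\!\left(-\frac{\|\tau_I\|^2}{2\eta^2}\right).$ (56)

Thus Lemma 4 is proven. ■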
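The convergence behind Lemma 4 can be checked numerically for a single coordinate ($|I| = 1$). The sketch below compares the volume ratio that appears in (53) with the Gaussian density in (52); it works with log-volumes through `math.lgamma` to avoid overflow at large $k$. The function names and the chosen test values are illustrative assumptions.

```python
import math


def log_ball_volume(k, r):
    """Logarithm of the volume of a k-dimensional ball of radius r, as in (51)."""
    return 0.5 * k * math.log(math.pi) + k * math.log(r) - math.lgamma(0.5 * k + 1.0)


def marginal_density_in_ball(t, k, eta):
    """Density of one coordinate of a point uniform in the ball of radius sqrt(k) * eta."""
    r2 = k * eta * eta - t * t
    if r2 <= 0.0:
        return 0.0
    # Volume of the (k-1)-dimensional slice at height t, divided by the ball volume
    return math.exp(log_ball_volume(k - 1, math.sqrt(r2))
                    - log_ball_volume(k, math.sqrt(k) * eta))


def gaussian_density(t, eta):
    """Zero-mean Gaussian density with variance eta ** 2."""
    return math.exp(-t * t / (2.0 * eta * eta)) / math.sqrt(2.0 * math.pi * eta * eta)


if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        ratio = marginal_density_in_ball(1.0, k, 1.5) / gaussian_density(1.0, 1.5)
        print(k, ratio)
```

The printed ratio should move toward 1 as $k$ grows, in line with (55) and (56).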
Lemma 4 indicates that the average of a function in a high dimensional ball can be calculated in a
smaller space. If the lattice cells used in the transformation-based DPQ were balls, we could use Lemma
4 to derive the transformation (39) for high dimensional situations, since the conditional c.d.f. in (39),
which is given by (43), is based on the average of a function over the basic cell. Unfortunately, lattice
cells cannot exactly be balls. However, a sequence of lattices can approach a ball in various senses
[26], [22], [27]. We now show that, by using a sphere-bound-achieving lattice sequence [27], the average
of a bounded function in the basic lattice cell follows the same behavior as in Lemma 4.

Lemma 5: For any integer set $I$ with finite cardinality, any bounded function $f : \mathbb{R}^{|I|} \to [-M, M]$
with some $M > 0$ and any $\eta > 0$, there exists a sequence of lattices with increasing dimensionality such
that

$\lim_{k \to \infty} \frac{1}{V_k} \int_{\mathcal{P}_0^{(k)}} f(\tau_I)\, d\tau = \int_{\mathbb{R}^{|I|}} f(\tau)\, (2\pi\eta^2)^{-|I|/2} \exp\!\left(-\frac{\|\tau\|^2}{2\eta^2}\right) d\tau,$ (57)

where $\mathcal{P}_0^{(k)}$ and $V_k$ denote the basic cell and its volume of the $k$-th lattice.

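Proof: We use a sequence of lattices that is sphere-bound-achieving. In particular, the volume of the
basic cell satisfies

$V_k = B_k(k^{1/2}\eta),$ (58)

and the probability that a $k$-tuple Gaussian vector, which consists of i.i.d. Gaussian r.v.'s with zero mean
and variance $\eta^2$, falls outside $\mathcal{P}_0^{(k)}$ approaches 0, when $k \to \infty$.

We observe

$\frac{1}{V_k} \int_{\mathcal{P}_0^{(k)}} f(\tau_I)\, d\tau = \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta)} f(\tau_I)\, d\tau$
$\quad + \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{P}_0^{(k)} \setminus \mathcal{B}_k(k^{1/2}\eta)} f(\tau_I)\, d\tau$
$\quad - \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}} f(\tau_I)\, d\tau.$ (59)

Using the fact that $f$ is bounded, we see

$\left| \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}} f(\tau_I)\, d\tau \right| \le \frac{\mathrm{Vol}\!\left(\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}\right)}{B_k(k^{1/2}\eta)}\, M.$ (60)

When $k \to \infty$, the $k$-tuple Gaussian random vector becomes asymptotically uniformly distributed in the ball. The
sphere-bound-achieving condition implies the following [27]:

$\lim_{k \to \infty} \frac{\mathrm{Vol}\!\left(\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}\right)}{B_k(k^{1/2}\eta)} = 0.$ (61)

Thus

$\lim_{k \to \infty} \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}} f(\tau_I)\, d\tau = 0.$ (62)

Because of (58), the two set differences $\mathcal{P}_0^{(k)} \setminus \mathcal{B}_k(k^{1/2}\eta)$ and $\mathcal{B}_k(k^{1/2}\eta) \setminus \mathcal{P}_0^{(k)}$ have the same volume, so

$\lim_{k \to \infty} \frac{\mathrm{Vol}\!\left(\mathcal{P}_0^{(k)} \setminus \mathcal{B}_k(k^{1/2}\eta)\right)}{B_k(k^{1/2}\eta)} = 0,$ (63)

and hence

$\lim_{k \to \infty} \frac{1}{B_k(k^{1/2}\eta)} \int_{\mathcal{P}_0^{(k)} \setminus \mathcal{B}_k(k^{1/2}\eta)} f(\tau_I)\, d\tau = 0.$ (64)

Then, using Lemma 4, Lemma 5 is proven. ■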



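According to Lemma 5, the calculation of the conditional c.d.f. of $\tilde{X}$, (43), and
therefore the transformation (39), in the limit, may not require an integration over the whole basic
cell. In particular, when $X$ is composed of independent r.v.'s, the transformation (39) can be significantly
simplified. Consider a sequence of transformation-based DPQs that uses the lattice sequence defined
in the proof of Lemma 5. Let $g_i^{(k)}(\tilde{x})$ be the $i$-th step of the transformation of the $k$-th DPQ. When the
source consists of independent r.v.'s, whose p.d.f.'s are bounded, for any particular $i$, applying Lemma 5 to
the numerator and the denominator of (43) shows that

$\lim_{k \to \infty} g_i^{(k)}(\tilde{x}) = F_{X_i}^{-1}\!\left( \frac{\int_{\mathbb{R}^{|I|+1}} F_{X_i}(\tilde{x}_i + \tau_i)\, f_{X_I}(\tilde{x}_I + \tau_I)\, (2\pi\eta^2)^{-\frac{|I|+1}{2}} \exp\!\left(-\frac{\|\tau_I\|^2 + \tau_i^2}{2\eta^2}\right) d\tau_I\, d\tau_i}{\int_{\mathbb{R}^{|I|}} f_{X_I}(\tilde{x}_I + \tau_I)\, (2\pi\eta^2)^{-\frac{|I|}{2}} \exp\!\left(-\frac{\|\tau_I\|^2}{2\eta^2}\right) d\tau_I} \right)$
$\quad = F_{X_i}^{-1}\!\left( \int_{\mathbb{R}} F_{X_i}(\tilde{x}_i + \tau_i)\, (2\pi\eta^2)^{-1/2} \exp\!\left(-\frac{\tau_i^2}{2\eta^2}\right) d\tau_i \right),$ (65)

where $I$ denotes $\{1, \dots, i-1\}$. Eq. (65) is valid only if the choice of $i$ is independent of $k$. However,
in the transformation (39), $i$ goes from 1 to $k$. To make the number of steps in the transformation
independent of the dimensionality of the transformation-based DPQ, we may increase the dimensionality
of a source vector by appending pseudo-random numbers to it. In this way, we can quantize a
$k$-dimensional source vector with a DPQ of an arbitrarily large dimensionality. In this setup, the asymptotic
behavior of the transformation, i.e., (65), is valid for all the steps in the transformation.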

V. Achievability of DP-RDF for Gaussian Distributions and MSE

Based on the high-dimensionality analysis of transformation-based DPQ, we can show that
transformation-based DPQ achieves the DP-RDF for i.i.d. Gaussian sources and MSE at any rate. We state this as follows.

Proposition 7: Given a source consisting of i.i.d. Gaussian r.v.'s, let $R_{\mathrm{DP}}(D)$ denote the DP-RDF for
the Gaussian distribution and MSE. For any distortion level $D > 0$, there exists a sequence of DPQs with
increasing dimensionality, such that the rate approaches $R_{\mathrm{DP}}(D)$, while the MSE approaches a level that
is smaller than or equal to $D$.

Proof: Denote the mean and the variance of the Gaussian distribution as $\mu_X$ and $\sigma_X^2$. When $D \ge 2\sigma_X^2$,
$R_{\mathrm{DP}}(D) = 0$, and Proposition 7 can be fulfilled by using the simple DPQ scheme described
in Section III-A. Therefore, we only need to consider the case $0 < D < 2\sigma_X^2$.

Using the lattice sequence in the proof of Lemma 5, we obtain a sequence of transformation-based
DPQs. Then, according to (65), we can show that the transformation for the $k$-th DPQ satisfies

$\lim_{k \to \infty} g_i^{(k)}(\tilde{x}) = F_{X_i}^{-1}\!\left( \int_{-\infty}^{\tilde{x}_i} \left(2\pi(\sigma_X^2 + \eta^2)\right)^{-1/2} \exp\!\left(-\frac{(t - \mu_X)^2}{2(\sigma_X^2 + \eta^2)}\right) dt \right)$
$\quad = \frac{\sigma_X}{\sqrt{\sigma_X^2 + \eta^2}}\, (\tilde{x}_i - \mu_X) + \mu_X,$ (66)

where $F_{X_i}$ is the Gaussian c.d.f. with mean $\mu_X$ and variance $\sigma_X^2$.

We notice that this result is very similar to Corollary 1. In fact, by the following discussion, we will show
that the rate-distortion performance of the transformation-based DPQ indeed approaches the DP-RDF for
the said Gaussian distribution and MSE.

It has been shown in [28] that a sphere-bound-achieving lattice sequence can also be good for
quantization. According to [22], for a lattice sequence that is good for quantization and satisfies the
volume condition (58), the noise introduced by the ECDQ is white and has a power approaching $\eta^2$.
Using the fact that the transformation approaches a linear operation, i.e., (66), we can show that the MSE
of the transformation-based DPQ satisfies

$D = \lim_{k \to \infty} D_k = 2\sigma_X^2 \left(1 - \frac{\sigma_X}{\sqrt{\sigma_X^2 + \eta^2}}\right).$ (67)

In addition, using a lattice sequence that is good for quantization, the rate of the ECDQ and hence the rate
of the transformation-based DPQ satisfy [22]

$R = \lim_{k \to \infty} R_k = \frac{1}{2} \log \frac{\sigma_X^2 + \eta^2}{\eta^2}.$ (68)

Finally, through some elementary algebra, we can verify

$R = \log \frac{\sigma_X^2}{\left(D\sigma_X^2 - D^2/4\right)^{1/2}}.$ (69)

Therefore, the rate and the MSE of the DPQ sequence approach a point on the DP-RDF. By choosing
the value of $\eta$, $D$ can take any value in $(0, 2\sigma_X^2)$. Then Proposition 7 is proven. ■
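The "elementary algebra" linking (67) and (68) to (69) can be checked numerically. The sketch below assumes rates in nats and per-dimension MSE; the closed form in `dp_rdf_gaussian` is the expression obtained by eliminating $\eta$ from (67) and (68), and the function names are hypothetical.

```python
import math


def dpq_operating_point(var_x, eta):
    """Limiting per-dimension MSE (67) and rate (68) for noise std eta; rate in nats."""
    slope = math.sqrt(var_x / (var_x + eta * eta))   # slope of the limiting linear map (66)
    distortion = 2.0 * var_x * (1.0 - slope)
    rate = 0.5 * math.log((var_x + eta * eta) / (eta * eta))
    return distortion, rate


def dp_rdf_gaussian(var_x, d):
    """Closed form (69), valid for 0 < d < 2 * var_x."""
    if not 0.0 < d < 2.0 * var_x:
        raise ValueError("need 0 < d < 2 * var_x")
    return math.log(var_x / math.sqrt(d * var_x - 0.25 * d * d))


if __name__ == "__main__":
    for eta in (0.1, 1.0, 10.0):
        d, r = dpq_operating_point(2.0, eta)
        print(f"eta={eta:<5} D={d:.5f} R={r:.5f} closed form={dp_rdf_gaussian(2.0, d):.5f}")
```

Sweeping `eta` from small to large values moves the operating point from high rate and low distortion toward zero rate at $D = 2\sigma_X^2$.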
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VI. Conclusions 



In this article, we proposed distribution preserving quantization (DPQ) as a new lossy source coding
concept, which aims to achieve a good perceived quality of signal reconstruction over the entire range of
rates. To this end, DPQ optimizes a rate-distortion trade-off under the constraint that the probability
distribution of the source is preserved in the reconstruction.

The minimum rate that any DPQ scheme can achieve, under a constraint on the distortion, is lower 
bounded by the distribution preserving rate-distortion function (DP-RDF). In general situations, the DP- 
RDF approaches the classic rate-distortion function on the same source and distortion measure, when 
the distortion decreases. This means that, at high rates, DPQ may perform as well as conventional 
quantization. At low rates, DPQ relies more on synthesis to reconstruct the source, thus maintaining 
good perceived quality. In particular, DPQ facilitates a seamless transition between signal quantization 
and synthesis. 

We also proposed an asymptotically optimal DPQ scheme, namely transformation-based DPQ. This 
scheme is shown to be as efficient as a classic quantization scheme for the mean squared error (MSE), 
as the rate increases. For i.i.d. Gaussian sources and MSE, transformation-based DPQ asymptotically 
achieves the DP-RDF as the dimensionality increases. 

Appendix 
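A. Proof of Proposition 6

Proof: Let $x$ be a realization of the source random vector $X$, and let $n$ be a realization of the ECDQ noise
$N$, which is uniformly distributed over $-\mathcal{P}_0$ and independent of the source.

Using the fact that the source is composed of independent r.v.'s, the transformation (39) becomes

$g_i(x + n) = F_{X_i}^{-1}\!\left( \frac{\int_{\mathcal{P}_0} F_{X_i}(x_i + n_i + \tau_i)\, f_{X_I}(x_I + n_I + \tau_I)\, d\tau}{\int_{\mathcal{P}_0} f_{X_I}(x_I + n_I + \tau_I)\, d\tau} \right),$ (70)

where $I = \{1, \dots, i-1\}$. Since we assume that the c.d.f. of each source r.v. has a bounded second
derivative, using Taylor series, there exists a $W$ such that

$\left| F_{X_i}(x_i + n_i + \tau_i) - F_{X_i}(x_i) - f_{X_i}(x_i)(n_i + \tau_i) \right| \le W (n_i + \tau_i)^2.$ (71)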



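Then we can find

$\frac{\int_{\mathcal{P}_0} F_{X_i}(x_i + n_i + \tau_i)\, f_{X_I}(x_I + n_I + \tau_I)\, d\tau}{\int_{\mathcal{P}_0} f_{X_I}(x_I + n_I + \tau_I)\, d\tau} \le F_{X_i}(x_i) + \epsilon_i,$ (72)

where

$\epsilon_i = f_{X_i}(x_i)\, \frac{n_i}{V^{1/k}}\, V^{1/k} + W \left( \frac{n_i^2}{V^{2/k}} + \sup_{\tau \in \mathcal{P}_0} \frac{\tau_i^2}{V^{2/k}} \right) V^{2/k}.$ (73)

To obtain (73), we exploited the condition that $\mathcal{P}_0$ is symmetric. Then we show that there exists an $M$
such that

$g_i(x + n) \le F_{X_i}^{-1}\!\left(F_{X_i}(x_i) + \epsilon_i\right)$ (74)
$\quad \le F_{X_i}^{-1}\!\left(F_{X_i}(x_i)\right) + \left.\frac{dF_{X_i}^{-1}(u)}{du}\right|_{u = F_{X_i}(x_i)} \epsilon_i + M\epsilon_i^2$ (75)
$\quad = x_i + \left(f_{X_i}(x_i)\right)^{-1} \epsilon_i + M\epsilon_i^2 \quad \text{a.s.},$ (76)

where (74) uses the non-decreasing property of the inverse c.d.f., (75) is due to Taylor series and the bound
on the second derivative of the inverse c.d.f. for each source r.v., and (76) holds almost surely because
$x_i$ is a realization of $X_i$. Similarly to the earlier derivation, we can have

$g_i(x + n) \ge x_i + \left(f_{X_i}(x_i)\right)^{-1} \delta_i - M\delta_i^2 \quad \text{a.s.},$ (77)

where

$\delta_i = f_{X_i}(x_i)\, \frac{n_i}{V^{1/k}}\, V^{1/k} - W \left( \frac{n_i^2}{V^{2/k}} + \sup_{\tau \in \mathcal{P}_0} \frac{\tau_i^2}{V^{2/k}} \right) V^{2/k}.$ (78)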
Therefore

$\left(x_i - g_i(x + n)\right)^2 \le \max\left\{ \left( (f_{X_i}(x_i))^{-1} \epsilon_i + M\epsilon_i^2 \right)^2, \left( (f_{X_i}(x_i))^{-1} \delta_i - M\delta_i^2 \right)^2 \right\} \quad \text{a.s.}$ (79)

Since $N$ is uniformly distributed over $-\mathcal{P}_0$, statistical moments of $N_i / V^{1/k}$ do not depend on $V$. In
addition, $\sup_{\tau \in \mathcal{P}_0} \tau_i^2 / V^{2/k}$ does not depend on $V$. Therefore

$E\{\|X - \hat{X}\|^2\} = \sum_{i=1}^{k} E\{(X_i - g_i(X + N))^2\}$ (80)

is bounded by a polynomial of $V^{1/k}$. To show Proposition 6, only terms of an order lower than 3 in the
polynomial are needed. The two terms in the maximization (79) share the same terms with an order
lower than 3. By picking out these terms, we have

$E\{\|X - \hat{X}\|^2\} = \sum_{i=1}^{k} E\{(X_i - g_i(X + N))^2\}$
$\quad \le \sum_{i=1}^{k} E\{N_i^2 + O(V^{3/k})\}$
$\quad = E\{\|X - \tilde{X}\|^2\} + O(V^{3/k}).$ (81)